In time for the first tests on LHC data we introduce a set of improvements and tests of purely kinematic
top tagging algorithms. First, we show how different jet algorithms can be used for different
transverse momentum regimes. Combining pruning and filtering in the reconstruction can enhance
the signal over background ratio significantly, while larger jet radii only give minor improvements.
Finally, bottom tagging can be added to the top tagger, but at least for the HEPTopTagger does
not improve the kinematic selection algorithm.
I. INTRODUCTION



The top quark, found in 1995 [1], is the heaviest and so far the only observed fermion with a weak-scale mass.
Therefore, it is expected to have strong ties to the mechanism which triggers electroweak symmetry breaking.
Searches for new physics in the top sector are of high priority because they can shed additional light on the
structure of the Standard Model at and above the weak scale. Many extensions of the Standard Model, like
supersymmetry or little Higgs models [2], predict top partners to ameliorate the loop-induced effect of the
top quark on the Higgs boson's mass. Typical signatures for such extended top sectors include top partners
decaying to a top quark and missing energy [3-6] or heavy resonances decaying to two often strongly boosted
top quarks [7, 8].

During the Tevatron's final years D0 and CDF have measured several anomalies directly related to the top
sector: in CDF's single top analysis the ratio between the s- and t-channel production rates deviates by $2.5\sigma$
from the SM prediction [9], and both experiments measure an enhanced $t\bar{t}$ forward-backward asymmetry
compared to SM predictions [10]. Also including the excess in the di-jet invariant mass spectrum of the $Wjj$
final state, measured (only) by CDF [11], all these anomalies show that we need an improved understanding
and simulation of top production processes [12].

The top pair production rate at the LHC ranges around one million tops per inverse femtobarn of integrated
luminosity. On the one hand, this means that top pair production is a very challenging background for
searches relying on high multiplicity final states of jets, leptons and missing energy [3-5, 13]. On the other
hand, this means that already now we can test top pair events in many different kinematic regimes. In this
paper we will focus on moderately boosted top quarks in the semileptonic decay channel.

While the idea of studying the substructure of jets is already a classic [14], the potential for searches of
massive Higgs and gauge bosons has only been appreciated recently [15-17]. In this paper we will focus on
tagging boosted top quarks [5, 18-26]. Aside from being sensitive probes of new physics they are also the
prime candidates to generally establish that fat-jet or subjet methods work at the LHC. Some very promising
ATLAS results on the HEPTopTagger performance on data can be found in Ref. [27]. CMS has
already released first search results using a top tagger [28]. Of the Tevatron anomalies listed above the top
forward-backward or charge asymmetry is particularly interesting in the light of boosted top quarks; the
ratio of initial state quarks vs gluons increases in the boosted regime, thereby enhancing the otherwise small
asymmetry at the LHC [29].

Starting from the default purely kinematic setup of the HEPTopTagger we investigate several avenues
on how to improve its performance: In Sec. II we discuss how well the momentum of subjets matches the
decay partons in the default setup and which strategies for an improvement should be promising. In Sec. III
we investigate the performance of different jet algorithms for the filtering and subjet reconstruction. In
Sec. IV we then study the tagging performance if we include pruning in combination with filtering. The
pruned top mass we use as an additional kinematic variable according to Ref. [30]. In Sec. V we investigate
the possibility to enlarge the size of the fat jet to $R = 1.8$, focussing on the currently most relevant low-$p_T$
tops. Finally, in Sec. VI we augment the kinematic top tagger by a $b$-tag inside the fat jet [31]. Two possible
strategies are simply adding the $b$-tag at the end of the top tagging algorithm or including it in a modified
extraction of the relevant subjets.



II. SUBJET-PARTON RECONSTRUCTION 



Before we can suggest and test improvements to our top tagger it is crucial that we study measures for
the quality of the top reconstruction. The geometrical distance between the reconstructed and the true top
momenta is simply

$\Delta R_{\rm top} = \Delta R(p_t^{\rm tagged}, p_t^{\rm parton})$ .   (1)
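As a minimal sketch of the distance measure entering Eq. (1), the snippet below computes $\Delta R$ between two directions given by rapidity and azimuthal angle. The function name `delta_r` and the use of rapidity rather than pseudorapidity are assumptions made for illustration; the text does not specify these details.

```python
import math

def delta_r(y1, phi1, y2, phi2):
    """Geometric distance between two directions in the (rapidity, azimuth) plane.

    Hypothetical helper: the azimuthal difference is wrapped into [-pi, pi)
    so that directions on both sides of phi = +-pi are treated as close.
    """
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(y1 - y2, dphi)

# Example: a tagged top and the true top separated by 0.3 in rapidity and 0.4 in azimuth
print(delta_r(0.0, 0.0, 0.3, 0.4))
```

The wrap-around of the azimuthal angle matters in practice, because two momenta at $\phi = 3.1$ and $\phi = -3.1$ are neighbours and not separated by 6.2.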

For a more detailed study we also compute the geometric separation of the top decay products, which
requires a proper definition of the parton and jet level constituents. Jet combinatorics is the main challenge,
in particular for hadronic top pair production at the LHC. For example, including up to two additional hard
QCD jets the $t\bar{t}$ sample consists of 6 to 8 partons which we label as (1,4) for the bottom quarks, (2,5) for
the harder $W$ decay partons, (3,6) for the softer $W$ decay partons, and (7,8,...) for additionally radiated
partons. The partons 1-3 and 4-6 come from one top decay each. After hadronization and jet reconstruction
the corresponding $b$, $W_1$, and $W_2$ subjets are defined such that $W_1$ (harder) and $W_2$ (softer) reconstruct $m_W$
best. Note that we do not apply any $b$-tagging, an issue we will look at in Sec. VI. We can then define



mappings 



where the label $j_i$ denotes the $i$-th hardest parton in the tagged top. The best parton-subjet mapping
$\{j_i\} = \{j_1, j_2, j_3\}$ is defined by the minimum $\Delta R_{\rm sum}(p_i^{\rm subjet}, p_{j_i}^{\rm parton})$ value. For example, for $\{j_i\} = \{1, 5, 7\}$
the hardest subjet corresponds to a $b$-quark, the second hardest to a decay parton from the other top, and the
softest to an additional parton from jet radiation. This way we can categorize all tagged tops into three types:

type 1: $\{j_1, j_2, j_3\}$ come from one top decay, i.e. {1,2,3} or {4,5,6}.

type 2: only the two hardest $\{j_1, j_2\}$ come from one top decay, $j_3$ has a different origin.

type 3: else.
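The matching and the three categories can be sketched as follows. This is only an illustration: the figure of merit is assumed here to be the plain sum of the three subjet-parton distances, and the names `best_mapping` and `tag_type` as well as the brute-force search over all assignments are hypothetical choices, not taken from the tagger itself.

```python
import math
from itertools import permutations

TOP_A = {1, 2, 3}  # b quark, harder W parton, softer W parton of the first top
TOP_B = {4, 5, 6}  # same for the second top

def delta_r(a, b):
    """Distance between two (rapidity, phi) directions."""
    dphi = (a[1] - b[1] + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(a[0] - b[0], dphi)

def best_mapping(subjets, partons):
    """Assign one parton label to each of the three pT-ordered subjets.

    subjets: list of three (rapidity, phi) tuples, hardest first.
    partons: dict {label: (rapidity, phi)}.
    Assumption: the best mapping minimizes the summed Delta R distance.
    """
    return min(
        permutations(partons, 3),
        key=lambda labels: sum(delta_r(s, partons[l]) for s, l in zip(subjets, labels)),
    )

def tag_type(mapping):
    """Return 1, 2 or 3 following the categories listed above."""
    j1, j2, j3 = mapping
    for top in (TOP_A, TOP_B):
        if {j1, j2} <= top:
            return 1 if j3 in top else 2
    return 3

print(tag_type((1, 2, 3)), tag_type((4, 5, 7)), tag_type((1, 5, 7)))
```

The example mapping {1, 5, 7} from the text ends up as type 3, because already the two hardest subjets originate from different tops.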



For semileptonic top pairs the combinatorics is simplified significantly, but we can still categorize all
tagged tops along the same lines. Our analysis is based on Alpgen-Pythia [32, 33] samples with MLM
merging [34] ($R_{\rm MLM} = 0.4$, $p_T^{\rm MLM} > 30$ GeV) and we cluster the visible final state using FastJet [35]. We




Figure 1: Left: $\Delta R_{\rm sum}$ as defined in Eq. (2); center: $\Delta R_{\rm top}$ as defined in Eq. (1); right: $\Delta p_T^{\rm top}/p_T^{\rm top}$. The colors represent
type 1 (red), type 2 (green), type 3 (blue) for semileptonic top pairs at a collider energy of 14 TeV.
Figure 2: Relative $p_T$ differences between subjets and the corresponding partons $\Delta p_T/p_T$ for the $b$-subjet (left), the
$W_1$-subjet (center), and the $W_2$-subjet (right). Again, the different colors show type 1 (red), type 2 (green) and
type 3 (blue).

take into account two hard jets in association with $t\bar{t}$ production and three to five hard jets for $W$+jets and
QCD jets. The $t\bar{t}$ sample we reweight to 918 pb [36]. The left panel of Fig. 1 shows $\Delta R_{\rm sum}$ for each type.
The quality of the reconstructed subjet directions is the same for all types, with the exception of long type 2
and type 3 tails. In the central and right panels of Fig. 1 we test the actual top momentum reconstruction
in terms of $\Delta R_{\rm top}$ and $\Delta p_T^{\rm top}/p_T^{\rm top} = (p_T^{\rm tagged} - p_T^{\rm parton})/p_T^{\rm tagged}$. Its quality depends on the different types of
parton identification and the most poorly reconstructed candidate tops are of type 2 and type 3. Unlike for
type 1 tops their distributions do not follow a largely Gaussian shape centered at zero but show a significant
shift.

Fig. 2 shows the transverse momentum difference between each subjet and the corresponding parton,
$\Delta p_T/p_T = (p_T^{\rm subjet} - p_T^{\rm parton})/p_T^{\rm subjet}$. We see that the subjet momentum reconstruction is essentially of the
same quality for all subjets and for all types, with the exception of $W_1$, which is better reconstructed than
$W_2$ because of its larger $p_T$ value.

In Tab. I we give the fraction of tagged tops for each of the three types after different requirements on
the quality of the reconstruction. For type 1 tags the momentum reconstruction both for subjets and tops is
almost perfect. For type 2 and type 3 tags, a large fraction of tagged tops satisfies $\Delta R_{\rm sum} < 0.4$, which means
the individual subjets are reconstructed well but the set of the partons is wrongly picked. Consequently,
the top momentum reconstruction for type 3 tags becomes worse. Thanks to a correct assignment for the
hardest two subjets in type 2 tags the top momentum reconstruction is not too bad because the wrong third
subjet does not contribute much to the top momentum.

From the discussion above we can conclude that the individual subjet-parton momentum reconstruction
works well for all types. The limitation to the top momentum reconstruction arises from events where some
of the identified subjets do not correspond to a top decay product. From Tab. I we estimate that $O(20\%)$
of type 3 tops and $O(50\%)$ of type 2 tops still give the correct top momentum within 15%. The fraction of
tagged tops with good momentum reconstruction within each type does not depend much on $p_{T,t}$; however,





                          tagged    $\Delta R_{\rm sum} < 0.4$    $\Delta R_{\rm top} < 0.2$    $\Delta p_T^{\rm top}/p_T < 0.15$

all $p_T^{\rm tagged}$
  total                   14156     11904 (84%)       10841 (77%)       11037 (78%)
  type 1                  10318      9531 (92%)       10102 (98%)        9897 (96%)
  type 2                   1336       896 (67%)         477 (36%)         623 (47%)
  type 3                   2503      1478 (59%)         263 (11%)         517 (21%)

$p_T^{\rm tagged} > 250$ GeV
  total                    6029      5279 (88%)        5191 (86%)        5170 (86%)
  type 1                   4919      4624 (94%)        4858 (99%)        4774 (97%)
  type 2                    412       273 (66%)         218 (53%)         244 (59%)
  type 3                    698       381 (55%)         115 (16%)         152 (22%)

Table I: Tagged top rates (in fb) in the semileptonic $t\bar{t}$ sample for 14 TeV collider energy. The percentages for the
additional cuts are relative to the numbers of tagged tops in each row.
the fraction of type 2 and type 3 tags in all tagged tops decreases for higher $p_T$ and effectively leads to a better
momentum reconstruction. In total, 85% of all tagged tops and up to 90% of all tags with $p_T^{\rm tagged} > 250$ GeV
reproduce the true momentum within a 15% error bar.

An assigned top tagging efficiency should describe which fraction of hadronic tops in any event sample are
tagged. We define such an efficiency step by step:

1. all decay products satisfy $R_{C/A} < 1.5$.

2. all decay products appear in a fat jet, i.e. there exist unfiltered subjets with $\Delta R(p^{\rm parton}, p^{\rm subjet}) < 0.4$.

3. all decay products appear in a fat jet with a top candidate fulfilling $150 < m_t^{\rm rec} < 200$ GeV.

4. all decay products appear in a fat jet with a tagged top, i.e. after the mass plane cut.

For the first step we use $R_{C/A} = \max\{R_1, R_2\}$ based on the two C/A measures for the necessary clustering
steps $R_{1,2}$. All these events are defined for the signal, so we can show them as a function of the true $p_{T,t}$ at
parton level.
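The first criterion can be illustrated with a short sketch. It assumes that $R_1$ is the distance of the closest parton pair and $R_2$ the distance between that merged pair and the third parton, which is how a C/A clustering of three isolated objects would proceed; the helper names and the four-vector recombination are choices made for this example.

```python
import math

def to_cartesian(pt, y, phi, m=0.0):
    """Four-vector (px, py, pz, E) from pT, rapidity, azimuth and mass."""
    mt = math.sqrt(pt * pt + m * m)
    return (pt * math.cos(phi), pt * math.sin(phi), mt * math.sinh(y), mt * math.cosh(y))

def rap_phi(p):
    px, py, pz, e = p
    return 0.5 * math.log((e + pz) / (e - pz)), math.atan2(py, px)

def delta_r(p, q):
    y1, phi1 = rap_phi(p)
    y2, phi2 = rap_phi(q)
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(y1 - y2, dphi)

def r_ca(p1, p2, p3):
    """Assumed definition: max of the two C/A merging distances for three partons."""
    parts = [p1, p2, p3]
    # First clustering step: the geometrically closest pair
    i, j = min([(0, 1), (0, 2), (1, 2)], key=lambda ij: delta_r(parts[ij[0]], parts[ij[1]]))
    r1 = delta_r(parts[i], parts[j])
    merged = tuple(a + b for a, b in zip(parts[i], parts[j]))
    # Second clustering step: merged pair with the remaining parton
    k = 3 - i - j
    r2 = delta_r(merged, parts[k])
    return max(r1, r2)

# Illustrative top decay partons (pT in GeV, rapidity, phi), not taken from the samples
b = to_cartesian(100.0, 0.0, 0.0)
q1 = to_cartesian(60.0, 0.4, 0.0)
q2 = to_cartesian(40.0, 0.0, 1.0)
print(r_ca(b, q1, q2) < 1.5)  # would this top pass the first criterion?
```

With this definition $R_{C/A}$ is the smallest C/A jet radius for which all three decay partons would end up in one jet.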

The left panel of Fig. 3 shows the event fractions corresponding to all four categories normalized by the
number of hadronic tops as a function of $p_T$. First, the dotted entries show the fraction of tops with
$R_{C/A} < 1.5$, while the solid entries show the fraction of top decay products inside a fat jet. For any fat jet
analysis they fix an upper bound on all tagging efficiencies.

The red dotted and solid entries show the fraction of candidate and tagged top events, all constrained to
type 1 tags. The difference between the two is simply given by the mass plane cuts. After these cuts roughly
30% of all hadronic tops are tagged above $p_T \approx 250$ GeV. Note that there exist type 2 and type 3 candidates
and tags, i.e. hadronic tops whose decay products are included in a fat jet but the extracted subjets do not
fall into type 1.

The right panel of Fig. 3 first shows the fraction of type 1 candidates and type 1 tags relative to the
number of fat jets including all three top decay products, corresponding to the second category in the above
list. Type 2 and type 3 are shown in green and blue. The difference between all top decay products inside
a fat jet and type 1 candidates is about 25% and almost constant for $200 < p_T < 500$ GeV. It is partly
(at most 10%) due to the existence of type 2 or type 3 candidates, which means the tagger wrongly selects
subjets even though the fat jet does include all decay products. Alternatively, there might be overlapping
subjets such that any three subjets are inconsistent with the top mass constraint. The difference between
candidates and tags corresponds to the mass plane cuts, where some signal loss is inevitable for rejecting
QCD and $W$+jets backgrounds. The numbers of type 2 and type 3 tags are negligible; most of the type 2
and type 3 tags shown in Tab. I come from fat jets not including a top. The mass plane cuts efficiently
Figure 3: Left: tagging efficiencies normalized to the number of hadronic tops as a function of $p_T$. Right: tagging
efficiencies relative to the number of hadronic tops included in a fat jet. The different curves are discussed in the
text.
remove such contributions. Hence, we see that once the top decay products are captured in a fat jet, 50%
to 60% of the tops are tagged.

The rapid drop of efficiency below $\sim 250$ GeV in the left panel of Fig. 3 is simply due to the rapid drop
of the fraction of decay products within $R_{C/A} < 1.5$ and inside the fat jet. The ratio of type 1 tags to decay
products within a fat jet is not small for $150 < p_T < 250$ GeV. Therefore, we will test an increased $R$ value
in Sec. V.

An obvious question from the right panel of Fig. 3 is why the tagging efficiency degrades for larger $p_T$, where
in principle the tagging performance should tend to increase. This effect of the mass plane cut is caused
by a mis-reclustering after filtering and is more pronounced for the C/A algorithm. The C/A algorithm
relies on the $R$ distance exclusively, therefore the softest two of the five filtered subjets are not necessarily
combined according to their shower history. As a result, we find unbalanced invariant masses from the three
re-clustered subjets. This happens more frequently in the high-$p_T$ regime where all five subjets are not
well separated, so the rejection probability by the mass plane cut increases with $p_T$. This tendency and
the possibility of changing the underlying jet algorithm and its effect on mass plane cuts we will discuss in
Sec. III.

To include backgrounds and mis-tagging efficiencies we need to define a slightly different set of scenarios,
namely as a function of the fat jet $p_T$. We define four scenarios similar to the ones before, but now in terms
of fat jets:

1. fat jets with three subjets or more after the mass drop criteria 

2. fat jets where all distances between top decay products and their closest subjets are less than 0.4. 

3. fat jets with a top candidate 

4. fat jets with a top tag 

The left panel of Fig. 4 shows all corresponding fractions relative to the number of fat jets as a function
of $p_T$. The black dotted and solid symbols show the fraction of events with at least three subjets and of all
top decay products included. More than half of the fat jet events with at least three subjets do not include
the top decay products, even for the semileptonic $t\bar{t}$ sample. The red entries show the fraction of candidates
(dotted) and tags (solid). Because all efficiencies are shown as a function of the fat jet $p_T$, we can also show
the backgrounds in blue and green. On the plateau we find tagged tops in roughly 20% of the fat jets for
semileptonic top pairs and 2-4% for $W$+jets and QCD jets.

The candidate histogram in the left panel of Fig. 4 exceeds the numbers for top decay products in the fat
jet because there exist fat jets whose three main subjets are not from a top decay and accidentally give $m_t$.
The central panel of Fig. 4 shows the composition of each type separately for candidates (dotted) and tags
(solid). Indeed, a considerable fraction of candidates are of type 3. On the other hand, type 3 and type 2
Figure 4: Left: efficiencies $\epsilon_{t\bar{t}}$, $\epsilon_{W+{\rm jets}}$, $\epsilon_{\rm QCD}$ as functions of the fat jet $p_T$. Center: fraction of tagged tops for type 1
(red), type 2 (green) and type 3 (blue). The dotted lines show the corresponding candidate fractions. Right: fraction
of type 1, type 2 and type 3 only for fat jets including a top.
tags are effectively rejected by the mass plane cut, so most of the tagged tops are of type 1. The fraction of
type 2 and type 3 tags ranges around 2-4%, similar to QCD and $W$+jets backgrounds.

The right panel shows the fractions of candidate and tagged fat jets including a hadronic top, so they are 
also constrained to belong to the second category. Compared with the central panel we are now less likely to 
encounter type 2 or type 3 candidates. Most tagged tops are now of type 1 and most type 2 and type 3 tags 
correspond to fat jets which do not include a hadronic top. Consequently, we find that there is not much 
room to improve our algorithm in selecting subjets after applying the mass drop criterion. 

III. ALTERNATIVE JET ALGORITHMS 

The combination of the Cambridge-Aachen clustering algorithm [37] with a mass drop criterion [15] is a
core feature of the HEPTopTagger and hence not negotiable. However, in the mass reconstruction after
filtering the C/A algorithm should be compared to alternative jet algorithms, like $k_T$ [38] or anti-$k_T$ [39].
After identifying three subjets based on the mass drop criterion there are two steps left to extract the $b$, $W_1$,
and $W_2$ subjets: filtering [15] and reclustering. We find that the choice of jet algorithm for the filtering does
not have a visible effect on the efficiency at particle level, while for the reclustering step it does. In our
explanations we therefore focus on effects of this reclustering which combines five filtered subjets into three
top decay subjets while we always use the same jet algorithm for filtering and for reclustering.

Our first result is that the anti-$k_T$ algorithm fails to reliably identify the three hard top decay products.
It tends to first recombine a pairing with large transverse momentum, such that of the three reclustered
subjets one is very hard and two are very soft. Applying our $W$ and top mass cuts will typically reject such
unbalanced combinations.
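The different behavior of the three algorithms in the reclustering step can be illustrated with a toy example. The sketch below uses the standard generalized-$k_T$ distance $d_{ij} = \min(p_{T,i}^{2p}, p_{T,j}^{2p})\,\Delta R_{ij}^2$ with $p = 1$ ($k_T$), $p = 0$ (C/A) and $p = -1$ (anti-$k_T$) and merges pairs until three subjets are left. The five input subjets are invented for illustration, and the simple exclusive clustering without beam distance is an assumption, not the FastJet setup used in the analysis.

```python
import math
from itertools import combinations

def four_vector(pt, y, phi):
    """Massless four-vector (px, py, pz, E) from pT, rapidity and azimuth."""
    return (pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(y), pt * math.cosh(y))

def pt_y_phi(p):
    px, py, pz, e = p
    return math.hypot(px, py), 0.5 * math.log((e + pz) / (e - pz)), math.atan2(py, px)

def dij(a, b, power):
    """Generalized-kT pair distance (the common 1/R^2 factor is dropped)."""
    pta, ya, phia = pt_y_phi(a)
    ptb, yb, phib = pt_y_phi(b)
    dphi = (phia - phib + math.pi) % (2.0 * math.pi) - math.pi
    return min(pta ** (2 * power), ptb ** (2 * power)) * ((ya - yb) ** 2 + dphi ** 2)

def recluster(subjets, power, n_final=3):
    """Merge the closest pair until n_final subjets remain; return their pT, hardest first."""
    jets = [four_vector(*s) for s in subjets]
    while len(jets) > n_final:
        i, j = min(combinations(range(len(jets)), 2),
                   key=lambda ij: dij(jets[ij[0]], jets[ij[1]], power))
        merged = tuple(a + b for a, b in zip(jets[i], jets[j]))
        jets = [p for k, p in enumerate(jets) if k not in (i, j)] + [merged]
    return sorted((pt_y_phi(p)[0] for p in jets), reverse=True)

# Three hard and two soft filtered subjets: (pT in GeV, rapidity, phi), illustrative values
filtered = [(200, 0.0, 0.0), (120, 0.5, 0.0), (80, 0.0, 0.6), (10, 0.8, -0.3), (6, -0.3, 0.85)]
for name, power in (("kT", 1), ("C/A", 0), ("anti-kT", -1)):
    print(name, [round(pt, 1) for pt in recluster(filtered, power)])
```

For this particular configuration the $k_T$ and C/A reclustering absorb the two soft subjets into their hard neighbours and return three balanced subjets of about 200, 130 and 86 GeV, while anti-$k_T$ merges the three hard subjets into one object of about 389 GeV and leaves the two soft subjets of 10 and 6 GeV untouched.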

Using the C/A algorithm a similar problem arises, but only for very large $p_{T,t}$, where the five filtered
subjets are close. According to QCD the two softest filtered subjets should then be merged into the main
three subjets, which the C/A algorithm achieves as long as the three main subjets are geometrically well




Figure 5: Mass plane for top candidates in $t\bar{t}$, $W$+jets and QCD events (left to right), with $\arctan(m_{13}/m_{12})$ on the
horizontal axis. The upper three panels use C/A reclustering while the lower three panels use $k_T$ reclustering.
Figure 6: Left: signal efficiencies relative to the number of hadronic tops for the $k_T$ (red), C/A (black) and anti-$k_T$
(with C/A clustering, blue) algorithms. Center: $k_T$ tagging efficiency relative to the number of tops included in a fat
jet for type 1 (red), type 2 (green), and type 3 (blue) tags. Right: mis-tag rates for QCD (blue) and $W$+jets (green)
for the C/A (dotted) and $k_T$ algorithm (solid).



separated. Once the geometric distances become small the probability to correctly reconstruct the three main
subjets decreases. Such signal events appear in the lower left corner of the ($\arctan m_{13}/m_{12}$ vs. $m_{23}/m_{123}$)
distributions [5] shown in Fig. 5. Events close to the $x$-axis have $m_{23} \approx 0$, which usually means $p_{T,3} \approx 0$ for the
third-hardest decay subjet.
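The two mass plane coordinates can be computed directly from the three reclustered subjets. The following sketch assumes that the subjets are given as four-vectors $(p_x, p_y, p_z, E)$ ordered by transverse momentum; the function names and the example momenta are hypothetical.

```python
import math

def inv_mass(*vectors):
    """Invariant mass of the summed four-vectors (px, py, pz, E)."""
    px, py, pz, e = (sum(c) for c in zip(*vectors))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def mass_plane(j1, j2, j3):
    """Return (arctan(m13/m12), m23/m123) for three pT-ordered subjets."""
    m12 = inv_mass(j1, j2)
    m13 = inv_mass(j1, j3)
    m23 = inv_mass(j2, j3)
    m123 = inv_mass(j1, j2, j3)
    return math.atan2(m13, m12), m23 / m123

# Three massless, well separated subjets
x, r = mass_plane((100.0, 0.0, 0.0, 100.0), (0.0, 80.0, 0.0, 80.0), (0.0, 0.0, 60.0, 60.0))
print(x, r)

# The same configuration with a very soft third subjet moves towards m23/m123 = 0
x_soft, r_soft = mass_plane((100.0, 0.0, 0.0, 100.0), (0.0, 80.0, 0.0, 80.0), (0.0, 0.0, 0.06, 0.06))
print(x_soft, r_soft)
```

The second call mimics a mis-reclustered candidate: a nearly vanishing third subjet pushes $m_{23}/m_{123}$ and $\arctan(m_{13}/m_{12})$ towards zero, i.e. towards the lower left corner of the mass plane.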

The $k_T$ algorithm recombines soft filtered subjets most reliably, so it can resolve the three main top decay
products best. We can see this in the lower panels of Fig. 5 where hardly any signal events migrate to
$m_{23} \approx 0$. As a result, the signal efficiency of passing the mass plane cut increases. This tagging efficiency we
show in Fig. 6. For both cases the filtering and reclustering algorithms are the same. In the low-$p_T$ regime
the difference between the two algorithms is indeed small. In the central panel we see that unlike the C/A
results in Fig. 3 the $k_T$ efficiency hardly decreases towards high $p_{T,t}$, i.e. the efficiency of the mass plane cut
is essentially constant. The fractions of type 2 and type 3 tags which include a top also decrease because
the subjet momentum reconstruction improves. If the subjets are better matched to the hard top decay
products more tagged tops are categorized as type 1.

Going beyond the signal, the same feature of fewer events with $m_{23} \approx 0$ also appears for the backgrounds.
The mass plane cut then leads to a less efficient rejection in particular for soft masses. This increase of
mis-tag probabilities we also show in Fig. 6. The solid crosses show the results for the $k_T$ algorithm while
the dotted crosses show those for the C/A algorithm. Quantitatively, the mis-tag efficiencies increase by a
slightly larger factor than the tagging efficiency, so switching to the $k_T$ algorithm does not improve $S/B$ but
can improve $S/\sqrt{B}$ slightly. In addition, these results are obtained without detector simulation and pileup.
Switching to the $k_T$ algorithm might have additional implications from both of them, so only a detailed
experimental analysis can determine which jet algorithm to use in the HEPTopTagger framework, and its
result might well be process dependent.



IV. PRUNING 



From the Higgs tagger based on the C/A algorithm and a mass drop criterion [15] we know that it can be
advantageous to combine filtering and pruning in the tagging procedure [30]. Pruning removes soft radiation
while clustering the fat jet [20, 21]: first, a sequential jet algorithm combines unfiltered subjets until no pair
of constituents is geometrically closer than $R_{\rm cut}$, representing an effective subjet cone size usually associated
with an intrinsic fatjet scale. We choose $R_{\rm cut} = m_{\rm fat\,jet}/p_{T,{\rm fat\,jet}}$. This cutoff can act differently for different
underlying jet algorithms. After this recombination the unfiltered subjet merging continues, but with the
Figure 7: Left: $\Delta m^{\rm prune}$ distribution for type 1 tags in $t\bar{t}$ (black), $W$+jets (green), and QCD (blue) events. Center:
signal efficiencies as a function of $p_{T,t}$ without (black) and with (red) the pruning cut $\Delta m^{\rm prune} < 20$ GeV defined in
Eq. (5). Right: mis-tag probabilities for QCD (blue) and $W$+jets (green) as functions of the fat jet $p_T$ with (solid) and without
(dotted) the pruning cut.



additional restriction that each combined pair of subjets has to be sufficiently hard 

$z = \frac{\min(p_{T,i}, p_{T,j})}{|\vec{p}_{T,i} + \vec{p}_{T,j}|} > z_{\rm cut}$ ,   (3)



where we choose $z_{\rm cut} = 0.1$. Otherwise, the constituents $i$ and $j$ are not combined and the one with smaller
transverse momentum is discarded. This algorithm continues until all constituents have been combined or
eliminated.
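The hardness condition of Eq. (3) for a single recombination step can be sketched as follows. The function names and the returned strings are hypothetical, and only the decision for one pair of constituents is shown, not the full clustering sequence.

```python
import math

def hardness_z(pt_i, phi_i, pt_j, phi_j):
    """z = min(pT_i, pT_j) / |vec pT_i + vec pT_j| for two constituents."""
    px = pt_i * math.cos(phi_i) + pt_j * math.cos(phi_j)
    py = pt_i * math.sin(phi_i) + pt_j * math.sin(phi_j)
    return min(pt_i, pt_j) / math.hypot(px, py)

def pruning_decision(pt_i, phi_i, pt_j, phi_j, z_cut=0.1):
    """Merge the pair if it is sufficiently hard, otherwise discard the softer constituent."""
    if hardness_z(pt_i, phi_i, pt_j, phi_j) > z_cut:
        return "merge"
    return "discard i" if pt_i < pt_j else "discard j"

print(pruning_decision(60.0, 0.0, 40.0, 0.3))   # balanced splitting
print(pruning_decision(100.0, 0.0, 5.0, 0.2))   # soft emission next to a hard constituent
```

A balanced pair such as 60 GeV and 40 GeV gives $z \approx 0.4$ and is merged, while a 5 GeV constituent next to a 100 GeV one gives $z \approx 0.05 < z_{\rm cut}$ and is dropped.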

There are two ways we can include pruning in our top tagger. First, we can prune the fat jet before we
run the C/A algorithm extracting the relevant splittings using the mass drop criterion. Alternatively, we
can use the pruning procedure in parallel to the filtering procedure and combine the two pieces of information.
This approach ensures that the additional pruning step does not affect the performance of the rest of the
tagging algorithm, so we investigate it in this section.



Since pruning is originally targeted at removing soft radiation, its impact is similar to that of filtering [15].
To quantify the difference we apply pruning to the constituents of the three subjets which we obtain after
the usual mass drop criterion. These three subjets are selected such that they give the best filtered top mass
among all combinations. The difference between the two algorithms is illustrated by the variable





                            tagged   $\Delta m^{\rm prune}$ <                              $\Delta m^{\rm unfilter}$ <
                                     15 GeV        20 GeV        30 GeV        15 GeV        20 GeV        30 GeV

$t\bar{t}$ [fb]             14156    7773 (55%)    9072 (64%)    10875 (77%)   6237 (44%)    7926 (56%)    10505 (74%)
  type 1                    10318    6255 (61%)    7152 (69%)    8316 (81%)    5141 (50%)    6377 (62%)    8129 (79%)
  type 2                     1336    551 (41%)     693 (52%)     893 (67%)     403 (30%)     570 (43%)     847 (63%)
  type 3                     2503    967 (39%)     1227 (49%)    1666 (67%)    693 (28%)     979 (39%)     1529 (61%)

$W$+jets [fb]                6590    2716 (41%)    3373 (51%)    4459 (68%)    2052 (31%)    2797 (42%)    4162 (63%)
  $\epsilon_{S/B}$              1    1.33          1.25          1.14          1.41          1.32          1.18
  $\epsilon_{S/\sqrt{B}}$       1    0.86          0.9           0.93          0.79          0.86          0.93

QCD [pb]                     1229    359 (29%)     474 (39%)     719 (59%)     207 (17%)     331 (27%)     609 (50%)
  $\epsilon_{S/B}$              1    1.88          1.66          1.31          2.62          2.08          1.5
  $\epsilon_{S/\sqrt{B}}$       1    1.02          1.03          1             1.07          1.08          1.05

Table II: Tagged top rates (in fb for $t\bar{t}$ and $W$+jets and pb for QCD jets) after cuts on $\Delta m^{\rm prune}$ or $\Delta m^{\rm unfilter}$. The
percentages are relative to the numbers without pruned or unfiltered mass cut in each category. $\epsilon_{S/B}$, $\epsilon_{S/\sqrt{B}}$ denote
improvement factors relative to no cuts.
$\Delta m^{\rm prune} = m^{\rm prune} - m^{\rm filter}$ ,   (4)

where $m^{\rm filter}$ is the filtered mass for the selected three subjets, and $m^{\rm prune}$ is the jet mass of the pruned jet.
The left panel in Fig. 7 shows $\Delta m^{\rm prune}$ for tagged tops in $t\bar{t}$ (type 1), QCD jets, and $W$+jets events. We
find that $\Delta m^{\rm prune}$ is larger for background events than for signal events. Pruning generally collects more
constituents than filtering, which discards some of the filtered subjets, so the pruned mass increases in a
busy jet environment. Background events rely on such busy events to mimic the generically hard top decay
products. Therefore, selecting tags with small $\Delta m^{\rm prune}$ effectively rejects the backgrounds. Even though
we do not show them, type 2 and type 3 tags behave similarly to the background samples, because QCD jet
radiation partly contributes to these tags. Thus, pruning also purifies the type 1 fraction for all tagged tops.

The event numbers after imposing

$-10~{\rm GeV} < \Delta m^{\rm prune} < \{15, 20, 30\}~{\rm GeV}$   (5)

we show in the left half of Tab. II. The percentages are relative to the number of tagged tops in each row.
The different efficiencies we can translate into improvement factors for $S/B$ and for $S/\sqrt{B}$. For example, we
can improve $S/B$ by roughly a factor two without loss of $S/\sqrt{B}$. In addition, the quality of the momentum
reconstruction improves with the increased fraction of type 1 tags.
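How the efficiencies translate into improvement factors can be checked with a few lines. The sketch assumes that $\epsilon_{S/B}$ is the ratio of signal and background cut efficiencies and $\epsilon_{S/\sqrt{B}}$ the signal efficiency divided by the square root of the background efficiency; the function name `improvement` is hypothetical and the input rates are the Tab. II entries for $\Delta m^{\rm prune} < 15$ GeV.

```python
import math

def improvement(sig_after, sig_before, bkg_after, bkg_before):
    """Return (eps_S/B, eps_S/sqrt(B)) for a cut, from rates before and after the cut."""
    eff_s = sig_after / sig_before
    eff_b = bkg_after / bkg_before
    return eff_s / eff_b, eff_s / math.sqrt(eff_b)

# Rates from Tab. II for the cut Delta m^prune < 15 GeV
ttbar = (7773.0, 14156.0)   # fb
wjets = (2716.0, 6590.0)    # fb
qcd = (359.0, 1229.0)       # pb

print([round(v, 2) for v in improvement(*ttbar, *wjets)])  # compare with 1.33 and 0.86
print([round(v, 2) for v in improvement(*ttbar, *qcd)])    # compare with 1.88 and 1.02
```

Under this assumption the numbers quoted in Tab. II are reproduced, including the factor of roughly two in $S/B$ for the QCD background at essentially unchanged $S/\sqrt{B}$.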

This additional cut becomes more important for larger thresholds of the subjet mass in the top tagger.
For example, compared with the default choice $m_{\rm subjet} > 50$ GeV a reduction to 30 GeV makes the cut on
$\Delta m^{\rm prune}$ less efficient. This arises because with a larger $m_{\rm subjet}$ threshold more constituents can contribute
in the background case.

According to the above discussion the difference in the pruned mass distribution for signal and backgrounds
is due to the subjet multiplicity inside the fat jet. A simpler intuitive measure for this feature is the fat
jet mass before filtering. Algorithmically, we select the relevant three subjets only after filtering, but the
original mass of the unfiltered subjets includes additional information:

$\Delta m^{\rm unfilter} = m^{\rm unfilter} - m^{\rm filter}$   (6)

In Fig. 8 we show this distribution for signal and backgrounds. Indeed, the left panel is very similar to the
$\Delta m^{\rm prune}$ distribution of Fig. 7. The two-dimensional correlation confirms that almost all events for signal
and background lie on the central diagonal of the $\Delta m^{\rm prune}$ vs $\Delta m^{\rm unfilter}$ plane.

In the right half of Tab. II we show the corresponding efficiencies after cutting on $\Delta m^{\rm unfilter}$. From both
variables we can obtain significant improvements on $S/B$, and it remains an experimental question which of
them is more stable once we include detector effects and pile-up.



V. FATTER JETS 



In the Standard Model the cross section for top pairs falls very steeply with increasing transverse momentum.
The fraction of top pair events above different $p_T^{\min}$ values at a 14 TeV LHC we calculated using
Figure 9: Left: tagging efficiency (solid red) and the fraction of tops included in the fat jet for $R = 1.8$ (solid) and
for $R = 1.5$ (dotted) as a function of $p_{T,t}$ for semileptonic $t\bar{t}$ events. Center: tagging candidates and tags relative to
all tops included in a fat jet. Type 2 and type 3 tags are shown in green and blue. Right: mis-tag rate as a function
of fat jet $p_T$ for QCD (blue) and $W$+jets (green).
14%    6.8%    3.4%    0.96%    0.33%



Extending the top tagging reach by 50 GeV towards smaller pT,t corresponds to doubling the number of 
accessible top pairs. 

The question becomes where the observed limitations of top tagging in this regime really come from and
whether these constraints can be removed. From the left panel of Fig. 3 we know that the fraction of hadronic
tops which can be included in a fat jet rapidly drops around $p_{T,t} = 200 - 250$ GeV. Compared to a well suited
data sample with $p_{T,t} > 300$ GeV the tagging probability roughly drops to half its value. On the other hand,
in the right panel of Fig. 3 we see how the fraction of tagged tops relative to the number of hadronic tops
included in a fat jet increases. This suggests that larger values than $R = 1.5$ should significantly improve
the tagging efficiency around $200 - 250$ GeV. In this section we will study an increase to $R = 1.8$ for our
standard HEPTopTagger setup to test such an option.
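A rough way to see which transverse momenta are affected is the common rule of thumb that the decay products of a boosted particle of mass $m$ are separated by about $\Delta R \approx 2m/p_T$. The sketch below inverts this estimate for the two fat jet sizes. Both the rule of thumb, which is only approximate for a three-body top decay, and the value $m_t = 173$ GeV are assumptions made for this illustration and are not part of the analysis.

```python
M_TOP = 173.0  # GeV, assumed top mass for this estimate

def min_pt_for_radius(radius, mass=M_TOP):
    """Transverse momentum at which Delta R ~ 2 m / pT equals the fat jet radius."""
    return 2.0 * mass / radius

for radius in (1.5, 1.8):
    print(radius, round(min_pt_for_radius(radius), 1))
# 1.5 -> 230.7 GeV, 1.8 -> 192.2 GeV
```

This simple estimate places the onset for $R = 1.5$ inside the $200 - 250$ GeV range where the fraction of contained tops drops, and suggests that $R = 1.8$ shifts it down by roughly 40 GeV.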

The tagging efficiency for $R = 1.8$ as a function of $p_{T,t}$ as well as the fraction of hadronic tops included in
a fat jet we show in the left panel of Fig. 9. By increasing $R$ from 1.5 to 1.8 we increase the tagging efficiency
for tops with $p_{T,t} = 150 - 250$ GeV by a factor 1.5 to 3. This is mainly an effect of more hadronic tops fully
included in the fat jet. In the $300 - 450$ GeV range the effect of an increased $R$ is small, and for $p_{T,t} > 450$ GeV
the efficiency even slightly decreases due to combinatorics.



$R = 1.8$                   tagged   $\Delta m^{\rm prune}$ <                              $\Delta m^{\rm unfilter}$ <
                                     15 GeV        20 GeV        30 GeV        15 GeV        20 GeV        30 GeV

$t\bar{t}$ [fb]             27853    10695 (38%)   13221 (47%)   17453 (63%)   8253 (30%)    11131 (40%)   16148 (58%)
  type 1                    17502    8114 (46%)    9716 (56%)    12195 (70%)   6463 (37%)    8403 (48%)    11507 (66%)
  type 2                     3628    934 (26%)     1252 (35%)    1847 (51%)    655 (18%)     996 (27%)     1666 (46%)
  type 3                     6723    1647 (24%)    2252 (34%)    3410 (51%)    1135 (17%)    1732 (26%)    2975 (44%)

$W$+jets [fb]               16920    4274 (25%)    5791 (34%)    8551 (51%)    3063 (18%)    4521 (27%)    7620 (45%)
  $\epsilon_{S/B}$           0.77    1.16          1.06          0.95          1.25          1.15          0.99
  $\epsilon_{S/\sqrt{B}}$    1.23    0.94          1             1.08          0.86          0.95          1.06

QCD [pb]                     4402    644 (15%)     936 (21%)     1627 (37%)    337 (8%)      584 (13%)     1279 (29%)
  $\epsilon_{S/B}$           0.55    1.44          1.23          0.93          2.13          1.65          1.1
  $\epsilon_{S/\sqrt{B}}$    1.04    1.04          1.07          1.07          1.11          1.14          1.12

Table III: Numbers of tagged tops with $R = 1.8$ with several cuts on $\Delta m^{\rm prune}$ or $\Delta m^{\rm unfilter}$. The percentages are
relative to the numbers of tagged tops without pruned or unfiltered mass cut in each category. $\epsilon_{S/B}$ and $\epsilon_{S/\sqrt{B}}$ denote
improvement factors relative to the $R = 1.5$ numbers with no cuts as shown in Tab. II.
The central panel shows the fractions of type 1 candidates and type 1 tags relative to all hadronic tops
included in a fat jet as functions of the hadronic top $p_T$. All the way down to $p_{T,t} = 100$ GeV the efficiencies are
flat, which means we have a fair chance to collect very moderately boosted tops. Type 2 and type 3 fractions
we show at the bottom of the figure. While the fraction of type 2 and type 3 tags increases, the fraction of
type 1 candidates and tags does not drastically change compared to $R = 1.5$ as shown in the right panel of
Fig. 3.

Finally, increasing R also increases the mis-tag rate significantly. The right panel of Fig. 9 shows the
mis-tag efficiency as a function of the fat jet p_T for QCD and W+jets events. We observe a larger increase
for these background processes than for the signal, which means we will not improve S/B through larger jet
sizes. However, we might improve the statistical significance measure S/√B.



In Tab. III we show the improvements in S/B and S/√B relative to the R = 1.5 case. We see that roughly
twice the number of tops get tagged, mainly at low transverse momenta. However, S/B still decreases by a
factor 1/2, while S/√B slightly improves as long as QCD is the main background.

To compensate for the increased backgrounds we can also apply pruning for R = 1.8. The corresponding
efficiencies for different pruned mass cuts we also include in Tab. III. Adding pruning shifts the performance
back to a level similar to our R = 1.5 results, and there is no obvious advantage in combining it with
larger fat jets, unless there should be a specific reason to target the low-p_T,t regime.
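
The improvement factors in Tab. III can be cross-checked with a few lines of arithmetic. The following is only a minimal sketch: it assumes that ε_S/B scales like S/B and ε_S/√B like S/√B when a mass cut keeps a given fraction of the uncut signal and background. The dictionary and function names are hypothetical; the inputs are the tt and QCD rows of the table for the pruned mass cut.

```python
from math import sqrt

# Tagged rates from Tab. III (R = 1.8): uncut, then pruned mass cuts 15/20/30.
TT_FB = {"none": 27853, 15: 10695, 20: 13221, 30: 17453}
QCD_PB = {"none": 4402, 15: 644, 20: 936, 30: 1627}

# Improvement factors of the uncut R = 1.8 sample relative to R = 1.5 (QCD rows).
EPS_SB_UNCUT = 0.55
EPS_SSQRTB_UNCUT = 1.04


def improvement_after_cut(cut):
    """Rescale the uncut factors by the surviving signal and background fractions."""
    s_frac = TT_FB[cut] / TT_FB["none"]
    b_frac = QCD_PB[cut] / QCD_PB["none"]
    eps_sb = EPS_SB_UNCUT * s_frac / b_frac
    eps_ssqrtb = EPS_SSQRTB_UNCUT * s_frac / sqrt(b_frac)
    return eps_sb, eps_ssqrtb


def signal_and_background_ratio(eps_sb, eps_ssqrtb):
    """Invert eps_sb = s/b and eps_ssqrtb = s/sqrt(b) for s = S'/S and b = B'/B."""
    s_ratio = eps_ssqrtb ** 2 / eps_sb
    b_ratio = s_ratio / eps_sb
    return s_ratio, b_ratio


if __name__ == "__main__":
    for cut in (15, 20, 30):
        print(cut, improvement_after_cut(cut))
    print(signal_and_background_ratio(EPS_SB_UNCUT, EPS_SSQRTB_UNCUT))
```

With these inputs the tightest pruned cut reproduces the tabulated 1.44 and 1.04 up to rounding, and inverting the uncut factors gives a signal ratio close to two, in line with the statement that roughly twice as many tops get tagged.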

VI. BOTTOM TAGGING 

A major difference in the background rejection between the C/A based Higgs tagger [TS] and corresponding
top taggers are additional b-tags. Only based on kinematic conditions it appears unlikely to achieve a QCD or
W+jets rejection of more than a factor 1/100. However, we can gain a significant improvement by requiring
a b-tag for one of the top decay jets. At the same time, for moderately boosted top quarks the kinematic
tagging algorithm might benefit from the identification of the b-jet, so we can first ensure that it is captured
and second use this information in the kinematic reconstruction.

Our first attempt to improve top tagging through an additional b-tag will leave the kinematic top tagging
algorithm unchanged and will instead focus on the selection of the subjet which should be b-tagged. All of
the usual top taggers treat three subjets democratically, i.e. without any b-tagging information. We label
the b, W_1 and W_2 subjets such that the W_1 and W_2 subjets (ordered by hardness) reconstruct m_W best; j_b,
j_W1, and j_W2 are the corresponding parton labels from Monte Carlo truth, as defined in Sec. II. An obvious
question is for what fraction of all tags the subjet labeled b really points to the bottom quark, b = j_b. For
type 1 tags by definition one of the subjets corresponds to a bottom, so b = j_b implies that all subjets are
correctly assigned. For such tags a W decay angle analysis [23] will work well.

In the first column of Tab. IV we summarize the probabilities to correctly assign the label b in the
kinematics-based top tagging. The percentages are defined relative to the number of tagged tops in each
category. For type 1 tags there will be exactly one b-parton in the tagged top, while for type 2 and type 3
tags there can be any number of b-partons. This means that only for type 1 tags all three fractions sum to
100%. Almost 80% of type 1 tags correctly assign b = j_b, so they make a good test sample for the question
if identifying b = j_b through a b-tag leads to an efficient rejection for W+jets and QCD events. From Tab. IV
we also see that the second-most likely subjet to be kinematically identified as a b really is j_W1. However,
this probability is less than a third of that for b = j_b. To rely on an additional b-tag for the case b = j_W1
does not even improve S/√B, so the only way to increase S/B is to apply a b-tag only to the kinematically
identified j_b.

As we show in Fig. [5] a significant fraction of events passing the mass plane cut have two pairs of subjets 
consistent with the W mass. This happens because the upper bound m^^ < — is numerically close 
to [T2]. We expect that tagged tops where only one pair of subjets is consistent with mw should more 
reliably give b = j_b. To test this, we count the number of subjet pairs consistent with m_W as n_W = 2 when
Δm_2 - Δm_1 < 10% × m_W, where Δm = |m_jj - m_W| for each pairing. Because n_W = 3 essentially never
appears we find n_W = 1 for the other top tags.
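
The counting rule can be written down in a few lines. The sketch below is an illustration under stated assumptions, not the tagger's actual code: it takes the three pairwise subjet masses as input, reads the criterion as a comparison of the best and second-best |m_jj - m_W|, and uses a reference value m_W = 80.4 GeV that is not quoted in the text. The function name is hypothetical, and n_W = 3 is not considered since it essentially never appears.

```python
M_W = 80.4  # GeV, assumed reference W mass (not quoted in the text)


def count_w_pairs(pair_masses, m_w=M_W, tolerance=0.10):
    """Return n_W for the three pairwise subjet masses of a tagged top.

    n_W = 2 if the second-best pairing is within tolerance * m_w of the best
    one in |m_jj - m_w|, otherwise n_W = 1.
    """
    deltas = sorted(abs(m - m_w) for m in pair_masses)
    return 2 if deltas[1] - deltas[0] < tolerance * m_w else 1


if __name__ == "__main__":
    print(count_w_pairs([80.0, 85.0, 30.0]))   # two pairings near m_W
    print(count_w_pairs([80.0, 120.0, 30.0]))  # only one pairing near m_W
```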



The distributions for kinematically identified b-subjets for different n_W we also show in Tab. IV. The
fraction of tagged tops with n_W = 1 is almost the same (~57%) for tt and for QCD and W+jets. Type 1 tagged
tops with n_W = 1 are more likely to give b = j_b than those with n_W = 2, while for type 2 and type 3 we do
not observe a significant difference.

Consequently, the best strategy to improve S/B in top tagging is to target n_W = 1 events and check the
kinematically identified b-subjet with a b-tag. In terms of a tagging efficiency ε_b and a mis-identification
rate ε^mis for light flavors (we ignore c quarks in this simple estimate) we show the possible improvements in
Tab. V. We expect an enhancement by 0.66 ε_b/ε^mis (0.73 ε_b/ε^mis for selecting n_W = 1) for S/B, so assuming
ε_b = 50% and ε^mis = 2% we find that S/B improves by 16.5 (18 for selecting n_W = 1). Similarly, we find an
improvement in S/√B around a factor 2.
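
As a worked version of this estimate, the following sketch evaluates the two improvement factors for given coefficients in front of ε_b and ε^mis. It assumes that the signal scales as (coefficient × ε_b) and the background as (coefficient × ε^mis); the function name is hypothetical, while the efficiencies are the ones assumed in the text.

```python
from math import sqrt

EPS_B = 0.50    # b-tagging efficiency assumed in the text
EPS_MIS = 0.02  # light-flavor mis-identification rate assumed in the text


def btag_improvement(signal_coeff, background_coeff, eps_b=EPS_B, eps_mis=EPS_MIS):
    """Improvement in S/B and S/sqrt(B) from an added b-tag.

    The signal is multiplied by signal_coeff * eps_b, the background by
    background_coeff * eps_mis.
    """
    signal = signal_coeff * eps_b
    background = background_coeff * eps_mis
    return signal / background, signal / sqrt(background)


if __name__ == "__main__":
    # b-tag on the kinematically identified b-subjet, all n_W.
    print(btag_improvement(0.66, 1.00))
    # Same, restricted to n_W = 1 (coefficients as listed for tt and QCD in Tab. V).
    print(btag_improvement(0.42, 0.58))
```

The first call gives 16.5 for S/B and about 2.33 for S/√B. The second gives roughly 18 and 1.95.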

                 all n_W        n_W = 1        n_W = 2
tt [fb]          14156          8058           6099
  b = j_b        9325 (66%)     5882 (73%)     3442 (56%)
  b = j_W1       2971 (21%)     1242 (15%)     1728 (28%)
  b = j_W2       1666 (12%)     833 (10%)      833 (14%)
type 1           10318          5808           4509
  b = j_b        7917 (77%)     5044 (87%)     2874 (64%)
  b = j_W1       1695 (16%)     502 (9%)       1193 (26%)
  b = j_W2       706 (7%)       263 (4%)       443 (10%)
type 2           1336           781            555
  b = j_b        565 (42%)      341 (44%)      224 (40%)
  b = j_W1       499 (37%)      294 (38%)      205 (37%)
  b = j_W2       392 (29%)      226 (29%)      166 (30%)
type 3           2503           1468           1035
  b = j_b        842 (34%)      498 (34%)      344 (33%)
  b = j_W1       777 (31%)      447 (30%)      331 (32%)
  b = j_W2       568 (23%)      344 (23%)      224 (22%)
W+jet [fb]       6590           3733           2857
QCD [pb]         1229           713            516

Table IV: Kinematic identification probabilities (b) for all three top decay partons for different types of top tags and
for different numbers of subjet pairings consistent with m_W (n_W).

As an alternative to an external b-tag, we can use the b-tagging information during the kinematic top
tagging algorithm. The selection of the three decay subjets after the mass drop criteria we modify in four
steps:

1. group the subjets into either b-tagged or non-b-tagged subjets.

2. take all possible triplets of one b-tagged and two non-b-tagged subjets and select the one with the best
filtered top mass.

3. check if this filtered top mass satisfies our criterion.

4. apply modified mass plane cuts.
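
Steps 1 and 2 of this selection can be sketched as follows. This is a simplified illustration rather than the HEPTopTagger implementation: it uses the plain invariant mass of each triplet instead of the filtered mass, takes an assumed reference value m_t = 172.3 GeV, and leaves out steps 3 and 4. All names are hypothetical; subjets are (E, px, py, pz) tuples with a b-tag flag.

```python
from itertools import combinations
from math import sqrt

M_TOP = 172.3  # GeV, assumed reference top mass (not quoted in this section)


def invariant_mass(*momenta):
    """Invariant mass of the summed four-momenta, each given as (E, px, py, pz)."""
    e, px, py, pz = (sum(p[i] for p in momenta) for i in range(4))
    return sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def select_triplet(subjets, m_top=M_TOP):
    """Pick the (b, w1, w2) triplet whose mass is closest to m_top.

    subjets is a list of (momentum, is_b_tagged) pairs. Returns None if no
    triplet with exactly one b-tagged subjet can be formed.
    """
    # Step 1: split into b-tagged and non-b-tagged subjets.
    tagged = [p for p, is_b in subjets if is_b]
    untagged = [p for p, is_b in subjets if not is_b]
    # Step 2: all triplets with one b-tagged and two non-b-tagged subjets.
    candidates = [(b, w1, w2) for b in tagged for w1, w2 in combinations(untagged, 2)]
    if not candidates:
        return None
    return min(candidates, key=lambda t: abs(invariant_mass(*t) - m_top))
```

The two non-b-tagged subjets are returned in input order here; ordering them by hardness, as in the labeling used above, would be a separate step.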

Since we require exactly one b-tagged subjet in step 2, we adapt the labels b, W_1, W_2 to reflect this. In step 4,
we should change the invariant masses in terms of (p_1, p_2, p_3) into (p_b, p_W1, p_W2). Our two-dimensional mass
plane becomes arctan(m_b2/m_b1) vs. m_12/m_123, where m_b1^2 = (p_b + p_W1)^2 and m_12^2 = (p_W1 + p_W2)^2. Fig. 10
shows the two-dimensional distribution of top candidates in the modified mass plane for semi-leptonic tt
events, W+jets, and QCD events. Unlike before, the signal events now show a clear W mass peak only for
m_12. To see how well this new algorithm might do we assume a 100% b-tagging efficiency of perfect purity.
Because for the backgrounds all three subjets could equally likely be mis-tagged we simply reweight each
of the possibilities by 1/3. Following these plots we can apply stricter mass plane cuts on the modified mass
plane than before. For our test we use

$$\left| \frac{m_{12}}{m_{123}} - \frac{m_W}{m_t} \right| < 15\% \qquad \text{and} \qquad 0.2 < \arctan\frac{m_{b2}}{m_{b1}} < 1.3 \; . \tag{7}$$
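
To make the modified mass plane concrete, the sketch below computes its two coordinates from three labeled subjet momenta. It is only an illustration: the names are hypothetical, momenta are (E, px, py, pz) tuples, and m_b2 is taken by analogy as the invariant mass of p_b + p_W2. Only the arctan window is checked here; the condition on m_12/m_123 is left to the caller.

```python
from math import atan2, sqrt


def invariant_mass(*momenta):
    """Invariant mass of the summed four-momenta, each given as (E, px, py, pz)."""
    e, px, py, pz = (sum(p[i] for p in momenta) for i in range(4))
    return sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def modified_mass_plane(p_b, p_w1, p_w2):
    """Return (arctan(m_b2 / m_b1), m_12 / m_123) for a labeled triplet."""
    m_b1 = invariant_mass(p_b, p_w1)
    m_b2 = invariant_mass(p_b, p_w2)
    m_12 = invariant_mass(p_w1, p_w2)
    m_123 = invariant_mass(p_b, p_w1, p_w2)
    # atan2 equals arctan(m_b2 / m_b1) for non-negative masses and tolerates m_b1 = 0.
    return atan2(m_b2, m_b1), m_12 / m_123


def in_arctan_window(angle, low=0.2, high=1.3):
    """Check the arctan condition of the mass plane cut."""
    return low < angle < high
```

For three massless subjets of equal energy at 120 degrees to each other, all pair masses coincide, so the first coordinate is π/4 and the second is 1/√3.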



Based on this modified algorithm the left panel of Fig. 11 shows the signal efficiency as a function of p_T,t.
Again, for the signal we assume a perfect b-tag. We can simply multiply by ε_b to compute the final top
tagging efficiency, ignoring the small effect of ε^mis. Similarly, we ignore cases with two b-subjets in the tagged
top as subleading by a factor ε_b(1 - ε_b) and only appearing for type 2 or type 3.

Taking into account the probability of 77% with which the default top tagger correctly assigns b = j_b, the
modified algorithm slightly increases the number of tagged tops if we assume perfect b-tagging. It returns
almost the same tagging efficiency as the purely kinematic tagger before adding the factor ε_b. This is because
the default top tagger identifies the crucial type 1 tops even without b-tagging information while type 2 or
type 3 configurations are comparably rare. The use of b-tagging really only helps to identify which subjet of
a type 1 tag corresponds to the bottom and to reject type 2 and type 3 tags. To confirm this, in the central
panel of Fig. 11 we show the efficiency as a function of the fat jet's p_T. Compared to p_T,t this effectively
removes some type 2 and type 3 tops.

The right panel of Fig. 11 shows the mis-tag rate of the modified algorithm. Here we simply assume that
one of the three subjets found by the kinematic top tagger is mis-identified. These candidate tops have to be
multiplied by 3 × ε^mis before imposing the modified mass plane cut. The effect of mis-identifying b-subjets
among others than the three subjets selected by the kinematic tagger we neglect. It should be less than 10%
given that more than 90% of the fat jets in which three subjets fulfill the top mass criteria only have one
such combination.

The main question is if the restricted mass plane cuts now reduce the backgrounds more efficiently. We find
that the cuts from Eq. (7) efficiently drop tagged tops from background samples as compared to democratic
mass plane cuts, but they do not compensate for the combinatorial factor 3 in the mis-tagging probability.
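
How much the mass plane cut would have to remove can be estimated with a short break-even calculation. The sketch below is an illustration only, with hypothetical names: it asks which fraction of the 3 × ε^mis background candidates may survive the cut if the modified tagger is to match the S/B of an external b-tag. The efficiencies ε_b and ε^mis cancel in this ratio; the coefficients 0.82, 0.66 and 1.66 are the tt and QCD entries of Tab. V.

```python
def break_even_survival(signal_modified, signal_external, combinatorial_factor=3.0):
    """Largest background survival fraction at which the modified tagger
    matches the S/B of a b-tag applied after kinematic tagging.

    signal_modified and signal_external are the coefficients in front of
    eps_b; the external tag has a background coefficient of one.
    """
    return (signal_modified / signal_external) / combinatorial_factor


if __name__ == "__main__":
    needed = break_even_survival(0.82, 0.66)
    observed = 1.66 / 3.0  # QCD coefficient after the cut over the initial factor 3
    print(needed, observed)
```

With the tabulated coefficients the cut would have to keep at most about 41% of the background candidates, while about 55% survive.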

           tagged   b = j_b                   b = j_b and n_W = 1       b-tag with cut for m_12
tt         1        0.33 (0.66 ε_b)           0.21 (0.42 ε_b)           0.41 (0.82 ε_b)
type 1     0.73     0.28 (0.56 ε_b)           0.18 (0.36 ε_b)           0.36 (0.72 ε_b)
W+jet      1        0.02                      0.0114 (0.57 ε^mis)       0.0318 (1.59 ε^mis)
QCD        1        0.02                      0.0116 (0.58 ε^mis)       0.0332 (1.66 ε^mis)
ε_S/B      1        16.5 (0.66 ε_b/ε^mis)     18.42 (0.72 ε_b/ε^mis)    12.35 (0.49 ε_b/ε^mis)
ε_S/√B     1        2.33 (0.66 ε_b/√ε^mis)    1.95 (0.55 ε_b/√ε^mis)    2.25 (0.64 ε_b/√ε^mis)

Table V: Efficiencies of a b-tag after top tagging and for the modified tagger with b-tagging. We assume ε_b = 0.5 and
ε^mis = 0.02 and quote the improvement factors ε_S/B, ε_S/√B against the QCD background.

Up to this point we have only considered semi-leptonic tt events. In particular for the possible improvement
through b-tagging we should consider the more combinatorics-prone purely hadronic decays. The left panel
of Fig. 12 shows the tagging efficiencies as a function of p_T,t for the fully hadronic tt sample. Compared to
Fig. 3 the tagger works with almost the same efficiency, provided we normalize the number of tagged events
to all hadronic tops.

The central panel of Fig. 12 shows the default top tagger efficiencies as a function of the transverse
momentum of the fat jet, again for hadronic decays. The tagging efficiency and the fraction of candidate
tops we show for type 1, type 2 and type 3 tags. Since the number of fat jets including a hadronic top,
shown as the black solid line, increases compared to semi-leptonic events, the resulting candidate and tagging
efficiencies become larger. The number of type 2 and type 3 tags increases simply with the jet multiplicities.
In the right panel of Fig. 12 we show the same fractions as in the central panel but requiring that all fat jets
include a hadronic top. For such fat jets we find type 2 or type 3 almost as rarely as in the semi-leptonic
sample, i.e. type 2 and type 3 candidates contribute at most 30% relative to type 1 candidates. This means
that even for the fully hadronic tt sample a modified algorithm including b-tagging will not provide enough
of an improvement to compensate for an expected 50% increase in the mis-tag rate.

In summary, using b-tagging when selecting the relevant three subjets in a fat jet does not enhance S/B.
The purely kinematic HEPTopTagger selects the correct set of subjets too reliably to gain a significant
improvement as long as the hadronic top is fully captured in a fat jet. The b-subjet selected based on
kinematics is usually identified correctly. Relying on b-tagging inside the kinematic algorithm is hurt by
the combinatorial mis-tagging efficiency of 3 × ε^mis while there is no such factor 3 for the signal. This
disadvantage is hard to compensate by improved mass plane cuts. To improve S/B, the best approach is to
use the b-tag only for the most probable b-subjet and simply add it to the kinematic tagging algorithm.



[Figure 11, three panels, horizontal axes p_T [GeV]. Left: "Normalized by hadronic top, w/o factor ε_b", curves
"candidate" and "tagged (b = j_b)". Center: "Normalized by fat jets, w/o factor ε_b", curves "include top" and
"tagged with b-tag". Right: "R = 1.5, C/A, with b-tag, w/o factor ε^mis", curves "W+jets" and "QCD".]

Figure 11: Left: tagging efficiency of the modified algorithm as a function of p_T,t and assuming ε_b = 1. Center:
tagging efficiency of the modified algorithm as a function of the p_T of the fat jet. Right: mis-tagging rate of the
modified algorithm as a function of the p_T of the fat jet for QCD (blue) and W+jets samples (green). Again, we
omit ε^mis. In all panels dotted lines show the HEPTopTagger results as a reference.

Figure 12: Left: tagging efficiencies normalized to the number of hadronic tops as a function of the p_T for the
fully hadronic tt sample. Center: efficiencies for type 1 (red), type 2 (green) and type 3 (blue) as functions of the fat
jet p_T for the fully hadronic tt sample. The dotted lines show the corresponding candidate fractions. Right: fraction
of type 1, type 2 and type 3 only for fat jets including a top.

VII. OUTLOOK

In this paper we have proposed and tested several modifications to kinematic top tagging, as implemented
in the HEPTopTagger. As a starting point we have shown that, provided the top quark is boosted enough
to be collected inside a fat jet, the usual kinematic criteria, i.e. a search for mass drops in the clustering
history and the reconstruction of three independent invariant mass variables from the suspected top decay
subjets, do not exhibit obvious shortcomings.

1. One possible improvement, curing for example the decreasing efficiency of C/A based taggers towards
larger boost, is a switch to the k_T algorithm once we have identified the main subjets using mass drops. It
keeps the tagging efficiency relative to the number of tops caught inside a fat jet on a plateau over the
entire range p_T,t = 150-600 GeV.

2. Using pruning in combination with the usual filtering procedure we gain an additional kinematic
variable. Cutting on it can almost double S/B relative to pure QCD backgrounds. However, this
improvement should be taken with a grain of salt until it can be confirmed by a proper detector
simulation in the presence of pile-up.

3. To extend the top tagger to lower boost we can increase the original size of the fat jet from R = 1.5 to
R = 1.8. Indeed, the efficiency for low transverse momenta increases, but so does the background
combinatorics. We find that neither S/B nor S/√B strongly benefits from this modification.

4. Including a b-tag as an additional step in the top tagging procedure can very significantly enhance the
background rejection. This is well known from Higgs tagging. We also find that for the HEPTopTagger
including such a b-tag in a modified algorithm is not promising.

At a time when we can expect the first officially tagged top quarks at the LHC these studies can guide us 
towards possible modifications and improvements of top taggers with different analyses in mind. They show 
that for example in the case of the HEPTopTagger there is still room for adjustment but that top taggers 
in general have within a few years reached an impressive level of maturity and reliability. 
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